Computational Learning Theory
Erzeugungsgrad, VC-Dimension and Neural Networks with rational activation function
Luis Miguel Pardo, Daniel Sebastián
The notion of Erzeugungsgrad was introduced by Joos Heintz in 1983 to bound the number of non-empty cells occurring after a process of quantifier elimination. We extend this notion and the combinatorial bounds of Theorem 2 in Heintz (1983) using the degree for constructible sets defined in Pardo-Sebastián (2022). We show that the Erzeugungsgrad is the key ingredient connecting affine Intersection Theory over algebraically closed fields with the VC theory of Computational Learning Theory for families of classifiers given by parameterized families of constructible sets. In particular, we prove via Intersection Theory that the VC dimension and the Krull dimension are linearly related up to logarithmic factors. Using this relation, we study the density of correct test sequences in evasive varieties. We apply these ideas to analyze parameterized families of neural networks with rational activation functions.
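To make the claimed relation concrete: writing $\mathcal{F}_C$ for a family of classifiers cut out by a constructible set $C$, the kind of two-sided bound at stake has the schematic form below. This is our notation, not the paper's exact statement; the constant $c$ and the precise degree invariant are assumptions, in the spirit of Goldberg-Jerrum-style VC bounds for polynomially parameterized classes.

\[
\dim(C) \;\lesssim\; \mathrm{VCdim}(\mathcal{F}_C) \;\le\; c \cdot \dim(C) \cdot \log_2\!\big(\deg(C)\big),
\]

where $\dim$ is the Krull dimension and $\deg$ the degree of constructible sets in the sense of Pardo-Sebastián (2022).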
Collaborative PAC Learning
Avrim Blum, Nika Haghtalab, Ariel D. Procaccia, Mingda Qiao
We consider a collaborative PAC learning model, in which k players attempt to learn the same underlying concept. We ask how many more samples are required to learn an accurate classifier for all players simultaneously. We refer to the ratio between the sample complexity of collaborative PAC learning and its non-collaborative (single-player) counterpart as the overhead.
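In symbols, writing $m_{\mathrm{collab}}(k,\varepsilon,\delta)$ and $m_{\mathrm{single}}(\varepsilon,\delta)$ for the respective $(\varepsilon,\delta)$-PAC sample complexities (notation ours, added for concreteness):

\[
\mathrm{overhead}(k) \;=\; \frac{m_{\mathrm{collab}}(k,\varepsilon,\delta)}{m_{\mathrm{single}}(\varepsilon,\delta)}.
\]

As a baseline, naively running a separate PAC learner for each of the k players costs roughly k times the single-player sample complexity, so the naive strategy has overhead on the order of k; the collaborative question is how far below this baseline one can go.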
An Information-Theoretic Approach to Generalization Theory
Borja Rodríguez-Gálvez, Ragnar Thobaben, Mikael Skoglund
We investigate the in-distribution generalization of machine learning algorithms. Departing from traditional complexity-based approaches, we analyze information-theoretic bounds that quantify the dependence between a learning algorithm and the training data. We consider two categories of generalization guarantees.

1) Guarantees in expectation. These bounds measure performance in the average case, and the dependence between the algorithm and the data is typically captured by information measures. While such measures offer an intuitive interpretation, they overlook the geometry of the algorithm's hypothesis class. We therefore introduce bounds based on the Wasserstein distance, which incorporate this geometry, together with a structured, systematic method to derive bounds capturing the dependence between the algorithm and an individual datum, and between the algorithm and subsets of the training data.

2) PAC-Bayesian guarantees. These bounds measure the performance level with high probability, and the dependence between the algorithm and the data is typically measured by the relative entropy. We establish connections between the Seeger-Langford and Catoni bounds, revealing that the former is optimized by the Gibbs posterior, and we introduce novel, tighter bounds for various types of loss functions. To achieve this, we develop a new technique for optimizing parameters in probabilistic statements.

To study the limitations of these approaches, we present a counterexample where most information-theoretic bounds fail while traditional approaches do not. Finally, we explore the relationship between privacy and generalization: we show that algorithms with bounded maximal leakage generalize, and, for discrete data, we derive new bounds for differentially private algorithms that guarantee generalization even with a constant privacy parameter, in contrast to previous bounds in the literature.
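For orientation, the prototypical bound in the first category is the classical mutual-information bound of Xu and Raginsky (2017), quoted here for context rather than as one of this work's new results: if the loss is $\sigma$-sub-Gaussian and $W$ denotes the hypothesis returned by the algorithm on a training set $S$ of $n$ i.i.d. samples, then

\[
\big|\, \mathbb{E}\big[ L(W) - \hat{L}_S(W) \big] \,\big| \;\le\; \sqrt{\frac{2\sigma^2}{n}\, I(W;S)},
\]

where $L$ is the population risk, $\hat{L}_S$ the empirical risk, and $I(W;S)$ the mutual information between the output hypothesis and the training data. The Wasserstein-distance and individual-datum bounds described above refine exactly this kind of dependence measure.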